Search CORE

2,069 research outputs found

GHN-Q: Parameter Prediction for Unseen Quantized Convolutional Architectures via Graph Hypernetworks

Author: Wong Alexander
Yun Stone
Publication date: 18/08/2023

Deep convolutional neural network (CNN) training via iterative optimization has had incredible success in finding optimal parameters. However, modern CNN architectures often contain millions of parameters. Thus, any given model for a single architecture resides in a massive parameter space. Models with similar loss could have drastically different characteristics such as adversarial robustness, generalizability, and quantization robustness. For deep learning on the edge, quantization robustness is often crucial. Finding a model that is quantization-robust can sometimes require significant effort. Recent works using Graph Hypernetworks (GHN) have shown remarkable performance predicting high-performing parameters of varying CNN architectures. Inspired by these successes, we wonder if the graph representations of GHN-2 can be leveraged to predict quantization-robust parameters as well, which we call GHN-Q. We conduct the first-ever study exploring the use of graph hypernetworks for predicting parameters of unseen quantized CNN architectures. We focus on a reduced CNN search space and find that GHN-Q can in fact predict quantization-robust parameters for various 8-bit quantized CNNs. Decent quantized accuracies are observed even with 4-bit quantization despite GHN-Q not being trained on it. Quantized finetuning of GHN-Q at lower bitwidths may bring further improvements and is currently being explored.

Comment: Updated Figure 1 and added additional results in Table 1. Initial extended abstract version accepted at Edge Intelligence Workshop 2022 for poster presentation.
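To make the 8-bit versus 4-bit comparison concrete, the following is a minimal sketch of symmetric, per-tensor uniform quantization applied to a list of weights. The abstract does not state which quantization scheme GHN-Q uses, so the scheme itself, the function names (`fake_quantize`, `mean_squared_error`) and the Gaussian test weights are all assumptions made for illustration.

```python
import random


def fake_quantize(weights, num_bits):
    """Quantize then dequantize weights (assumed symmetric per-tensor scheme)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)  # nothing to scale
    scale = max_abs / qmax  # width of one quantization step
    # Round to the nearest integer level, clamp to the range, map back to floats.
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
# Hypothetical weights; real CNN layers would supply their own tensors.
weights = [random.gauss(0.0, 0.05) for _ in range(1000)]
errors = {
    bits: mean_squared_error(weights, fake_quantize(weights, bits))
    for bits in (8, 4)
}
```

Under these assumptions the 4-bit step is roughly 18 times wider than the 8-bit step (127/7), so the rounding error stored in `errors[4]` is much larger than `errors[8]`. This is the general reason lower bitwidths are harder to keep accurate.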

arXiv.org e-Print Archive

GHN-QAT: Training Graph Hypernetworks to Predict Quantization-Robust Parameters of Unseen Limited Precision Neural Networks

Author: Wong Alexander
Yun Stone
Publication date: 24/09/2023

Graph Hypernetworks (GHN) can predict the parameters of varying unseen CNN architectures with surprisingly good accuracy at a fraction of the cost of iterative optimization. Following these successes, preliminary research has explored the use of GHNs to predict quantization-robust parameters for 8-bit and 4-bit quantized CNNs. However, this early work leveraged full-precision float32 training and only quantized for testing. We explore the impact of quantization-aware training and/or other quantization-based training strategies on quantized robustness and performance of GHN-predicted parameters for low-precision CNNs. We show that quantization-aware training can significantly improve quantized accuracy for GHN-predicted parameters of 4-bit quantized CNNs and even lead to greater-than-random accuracy for 2-bit quantized CNNs. These promising results open the door for future explorations such as investigating the use of GHN-predicted parameters as initialization for further quantized training of individual CNNs, further exploration of "extreme bitwidth" quantization, and mixed precision quantization schemes.

Comment: Poster and extended abstract to be presented at the Workshop for Low Bit Quantized Neural Networks (LQBNN) @ ICCV 202

arXiv.org e-Print Archive

Where Should We Begin? A Low-Level Exploration of Weight Initialization Impact on Quantized Behaviour of Deep Neural Networks

Author: Wong Alexander
Yun Stone
Publication venue: University of Waterloo
Publication date: 30/11/2020

With the proliferation of deep convolutional neural network (CNN) algorithms for mobile processing, limited precision quantization has become an essential tool for CNN efficiency. Consequently, various works have sought to design fixed precision quantization algorithms and quantization-focused optimization techniques that minimize quantization-induced performance degradation. However, there is little concrete understanding of how various CNN design decisions/best practices affect quantized inference behaviour. Weight initialization strategies are often associated with solving issues such as vanishing/exploding gradients, but an often-overlooked aspect is their impact on the final trained distributions of each layer. We present an in-depth, fine-grained ablation study of the effect of different weight initializations on the final distributions of weights and activations of different CNN architectures. The fine-grained, layerwise analysis enables us to gain deep insights into how initial weight distributions affect final accuracy and quantized behaviour. To the best of our knowledge, we are the first to perform such a low-level, in-depth quantitative analysis of weight initialization and its effect on quantized behaviour.

arXiv.org e-Print Archive

Waterloo Library Journal Publishing Service (University of Waterloo, Canada)


Comparing Propensity Score Methods in Balancing Covariates and Recovering Impact in Small Sample Educational Program Evaluations

Author: Stone Clement A.
Tang Yun
Publication venue: ScholarWorks@UMass Amherst
Publication date: 25/11/2019

Propensity score applications are often used to evaluate educational program impact. However, various options are available to estimate both propensity scores and construct comparison groups. This study used a student achievement dataset with commonly available covariates to compare different propensity scoring estimation methods (logistic regression, boosted regression, and Bayesian logistic regression) in combination with different methods for constructing comparison groups (nearest-neighbor matching, optimal matching, weighting) relative to balancing pre-existing differences and recovering a simulated treatment effect in small samples. Results indicated that applied researchers evaluating program impact should first consider use of standard logistic regression methods with nearest-neighbor or optimal matching or boosted regression in combination with propensity score weighting. Advantages and disadvantages of the methods are discussed.
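As an illustration of one of the comparison-group methods named above, here is a minimal sketch of greedy 1:1 nearest-neighbor matching without replacement. It assumes the propensity scores have already been estimated (for example by logistic regression); the function name `nearest_neighbor_match`, the unit identifiers and the scores are hypothetical, and the study may have used a different matching variant (calipers, matching with replacement, and so on).

```python
def nearest_neighbor_match(treated, control):
    """Greedy 1:1 matching on propensity scores, without replacement.

    treated, control: dicts mapping a unit id to its propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls not yet used in a pair
    pairs = []
    # Process treated units from highest score down, a common greedy ordering.
    ordered = sorted(treated.items(), key=lambda item: item[1], reverse=True)
    for treated_id, treated_score in ordered:
        if not available:
            break  # more treated units than controls
        # Pick the remaining control whose score is closest.
        control_id = min(
            available, key=lambda cid: abs(available[cid] - treated_score)
        )
        pairs.append((treated_id, control_id))
        del available[control_id]
    return pairs


# Hypothetical scores for a tiny sample.
treated_scores = {"t1": 0.80, "t2": 0.55}
control_scores = {"c1": 0.78, "c2": 0.50, "c3": 0.20}
matches = nearest_neighbor_match(treated_scores, control_scores)
```

With these made-up scores, `t1` pairs with `c1` and `t2` pairs with `c2`, leaving `c3` unmatched. Greedy matching is simple but order-dependent. Optimal matching, also named in the abstract, instead minimizes the total distance over all pairs.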

ScholarWorks@UMass Amherst

An Analysis Framework for the Quantization-Aware Design of Efficient, Low-Power Convolutional Neural Networks

Author: Yun Stone
Publication venue: University of Waterloo
Publication date: 22/04/2022

Deep convolutional neural network (CNN) algorithms have emerged as a powerful tool for many computer vision tasks such as image classification, object detection, and semantic segmentation. However, these algorithms are computationally expensive and difficult to adapt for resource-constrained environments. With the proliferation of CNNs for mobile, there is a growing need for methods to reduce their latency and power consumption. Furthermore, we would like a principled approach to the design and understanding of CNN model behaviour. Computationally efficient CNN architecture design and running inference with limited precision arithmetic (commonly referred to as neural network quantization) have become ubiquitous techniques for speeding up CNN inference and reducing power consumption. This work describes a method for analyzing the quantized behaviour of efficient CNN architectures and subsequently leveraging those insights for quantization-aware design of CNN models. We introduce a framework for fine-grained, layerwise analysis of CNN models during and after training. We present an in-depth, fine-grained ablation approach to understanding the effect of different design choices on the layerwise distributions of weights and activations of CNNs. This layerwise analysis enables us to gain deep insights into how the interaction of training data, hyperparameters, and CNN architecture can ultimately affect quantized behaviour. Additionally, analysis of these distributions can yield additional insights into how information is propagating through the system. Various works have sought to design fixed precision quantization algorithms and optimization techniques that minimize quantization-induced performance degradation. However, to the best of our knowledge, there have not been any prior works focusing on a fine-grained analysis of why a given CNN's quantization behaviour is observed.
We demonstrate the use of this framework in two contexts of quantization-aware model design. The first is a novel ablation study investigating the impact of random weight initialization on final trained distributions of different CNN architectures and resulting quantized accuracy. Next, we combine our analysis framework with a novel "progressive depth factorization" strategy for an iterative, systematic exploration of efficient CNN architectures under quantization constraints. We algorithmically increase the granularity of depth factorization in a progressive manner while observing the resulting change in layer-wise distributions. Thus, progressive depth factorization yields in-depth, layer-level insights into efficiency-accuracy tradeoffs. Coupling fine-grained analysis with progressive depth factorization frames our design in the context of quantized behaviour. Thus, it enables efficient identification of the optimal depth-factorized macroarchitecture design based on the desired efficiency-accuracy requirements under quantization.

University of Waterloo's Institutional Repository

Hot Jupiter Magnetospheres

Author: Arras
Ben-Jaffel
Braginskii
Charbonneau
Feldman
Fossati
George B. Trammell
Gu
Hapke
Kopal
Koskinen
Lai
Lamers
Linsky
Mestel
Mihalas
Murray
Murray-Clay
Okamoto
Osterbrock
Parks
Phil Arras
Schunk
Spruit
Stone
Sánchez-Lavega
Tian
Vidal-Madjar
Winn
Wood
Zhi-Yun Li
Publication venue: IOP Publishing
Publication date: 29/10/2010

(Abridged) The upper atmospheres of close-in gas giant exoplanets are subjected to intense heating/tidal forces from their parent stars. Atomic/ionized hydrogen (H) layers are sufficiently rarefied that magnetic pressure may dominate gas pressure for expected planetary magnetic field strength. We examine the magnetospheric structure using a 3D isothermal magnetohydrodynamic model that includes: a static "dead zone" near the magnetic equator containing magnetically confined gas; a "wind zone" outside the magnetic equator in which thermal pressure gradients and the magneto-centrifugal-tidal effect give rise to transonic outflow; and a region near the poles where sufficiently strong tidal forces may suppress transonic outflow. Using dipole field geometry, we estimate the size of the dead zone to be ~1-10 planetary radii for a range of parameters. To understand appropriate base conditions for the 3D isothermal model, we compute a 1D thermal model in which photoelectric heating from the stellar Lyman continuum is balanced by collisionally-excited Lyman α cooling. This 1D model exhibits a H layer with temperatures T = 5000-10000 K down to pressures of 10-100 nbar. Using the 3D isothermal model, we compute H column densities and Lyman α transmission spectra for parameters appropriate to HD 209458b. Line-integrated transit depths of 5-10% can be achieved for the above base conditions. Strong magnetic fields increase the transit signal while decreasing the mass loss, due to higher covering fraction and density of the dead zone. In our model, most of the transit signal arises from magnetically confined gas, some of which may be outside the L1 equipotential. Hence the presence of gas outside the L1 equipotential does not directly imply mass loss. Lastly, we discuss the domain of applicability for the magnetic wind model described in this paper and in the Roche-lobe overflow model.

Comment: 26 pages, 17 figures (5 color), 2 appendices; submitted to ApJ; higher resolution version available at http://www.astro.virginia.edu/~gbt8f/HotJupMag_fullres_astroph.pd
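The claim that magnetic pressure may dominate gas pressure can be checked with a short calculation of the ratio of gas pressure to magnetic pressure (the plasma beta), using the standard Gaussian-units expression B²/(8π) for magnetic pressure. The sketch below is illustrative only: the 1 G field strength is an assumed value that does not come from the abstract, and the function names are hypothetical. Only the 10-100 nbar pressure range is taken from the text.

```python
import math

DYN_PER_CM2_PER_NBAR = 1.0e-3  # 1 bar = 1e6 dyn/cm^2, so 1 nbar = 1e-3 dyn/cm^2


def magnetic_pressure(b_gauss):
    """Magnetic pressure B^2 / (8 pi) in dyn/cm^2 for a field in gauss."""
    return b_gauss ** 2 / (8.0 * math.pi)


def plasma_beta(gas_pressure_nbar, b_gauss):
    """Ratio of gas pressure to magnetic pressure (dimensionless)."""
    gas_pressure = gas_pressure_nbar * DYN_PER_CM2_PER_NBAR
    return gas_pressure / magnetic_pressure(b_gauss)


# Pressure range quoted in the abstract, with an ASSUMED field of 1 G.
beta_at_10_nbar = plasma_beta(10.0, 1.0)
beta_at_100_nbar = plasma_beta(100.0, 1.0)
```

For the assumed 1 G field, beta is roughly 0.25 at 10 nbar and 2.5 at 100 nbar, so magnetic and gas pressure are comparable near the base of the layer. Beta falls further wherever the gas pressure drops faster than B² with height.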

arXiv.org e-Print Archive

Crossref

Reproductive factors associated with mammographic density: a Korean co-twin control study

Author: Lee Donghun
Lee Kayoung
Song Yun-Mi
Stone Jennifer
Sung Joohon
성주헌
Publication venue: Springer Science and Business Media LLC
Publication date: 01/07/2011

To determine the mechanism by which menstrual and reproductive factors are associated with the risk of breast cancer, we examined the relationships between mammographic density and known menstrual and reproductive risk factors for breast cancer. A co-twin control study was conducted with 122 pairs of monozygotic Korean female twins selected from the Healthy Twin study. Mammographic density was measured from digital mammograms using a computer-assisted method. Information on selected menstrual and reproductive factors was collected through a self-administered questionnaire. Within-pair differences for each mammographic measure were regressed against within-pair differences for each menstrual and reproductive risk factor with an adjustment for body mass index and other menstrual and reproductive factors. The percent dense area was inversely associated with the age at the first full-term childbirth (FFTB) and the number of live births, although the associations were marginally significant with an adjustment for BMI and other reproductive factors. The non-dense area was positively associated with the age at the FFTB and the number of live births. The absolute dense area was positively associated with the duration of breast feeding. The age at menarche was not associated with any component of the mammographic measures. This finding suggests that mammographic density can mediate the protective effect of greater parity against breast cancer, at least in part while age at menarche, age at the FFTB, and breast feeding do not exert their effects through mammographic density.

SNU Open Repository and Archive

Water dispersible microbicidal cellulose acetate phthalate film

BACKGROUND: Cellulose acetate phthalate (CAP) has been used for several decades in the pharmaceutical industry for enteric film coating of oral tablets and capsules. Micronized CAP, available commercially as "Aquateric" and containing additional ingredients required for micronization, used for tablet coating from water dispersions, was shown to adsorb and inactivate the human immunodeficiency virus (HIV-1), herpesviruses (HSV) and other sexually transmitted disease (STD) pathogens. Earlier studies indicate that a gel formulation of micronized CAP has a potential as a topical microbicide for prevention of STDs including the acquired immunodeficiency syndrome (AIDS). The objective of endeavors described here was to develop a water dispersible CAP film amenable to inexpensive industrial mass production. METHODS: CAP and hydroxypropyl cellulose (HPC) were dissolved in different organic solvent mixtures, poured into dishes, and the solvents evaporated. Graded quantities of a resulting selected film were mixed for 5 min at 37°C with HIV-1, HSV and other STD pathogens, respectively. Residual infectivity of the treated viruses and bacteria was determined. RESULTS: The prerequisites for producing CAP films which are soft, flexible and dispersible in water, resulting in smooth gels, are combining CAP with HPC (other cellulose derivatives are unsuitable), and casting from organic solvent mixtures containing ≈50 to ≈65% ethanol (EtOH). The films are ≈100 µm thick and have a textured surface with alternating protrusions and depressions revealed by scanning electron microscopy. The films, before complete conversion into a gel, rapidly inactivated HIV-1 and HSV and reduced the infectivity of non-viral STD pathogens >1,000-fold. CONCLUSIONS: Soft pliable CAP-HPC composite films can be generated by casting from organic solvent mixtures containing EtOH. The films rapidly reduce the infectivity of several STD pathogens, including HIV-1. They are converted into gels and thus do not have to be removed following application and use. In addition to their potential as topical microbicides, the films have promise for mucosal delivery of pharmaceuticals other than CAP.

Crossref

Springer - Publisher Connector

PubMed Central

Iron bioavailability in two commercial cultivars of wheat: a comparison between wholegrain and white flour and the effects of nicotianamine and 2'-deoxymugineic acid on iron uptake into Caco-2 cells

Author: Anna A. Wawer
Ariza-Nieto M.
Beta T.
Bouis H. E.
Cakmak I.
Cheng L.
Eagling T.
Engle-Stone R.
Fang-Jie Zhao
Fernandez-Orozco R.
Garcia M. N.
Glahn R. P.
Haas J. D.
Higuchi K.
Hurrell R.
Kalgaonkar S.
Lee S.
Lipschitz D. A.
Mattila P.
McLean E.
Mino Y.
Morris E. R.
Ortiz-Monasterio J. I.
Oury F.-X.
Persson D. P.
Peter R. Shewry
Qureshi I. M.
Sandberg A. S.
Schlemmer U.
Scientific Advisory Committee on Nutrition
Shewry P. R.
Shojima S.
Susan J. Fairweather-Tait
Thavarajah D.
Thompson B.
Thompson B. A. V.
Tristan Eagling
Velu G.
Wawer A. A.
White P. J.
Yun S.
Zhang Y.
Zhao F. J.
Zheng L.
Publication venue: American Chemical Society (ACS)
Publication date: 01/01/2014

Iron bioavailability in unleavened white and wholegrain bread made from two commercial wheat varieties was assessed by measuring ferritin production in Caco-2 cells. The breads were subjected to simulated gastrointestinal digestion and the digests applied to the Caco-2 cells. Although Riband grain contained a lower iron concentration than Rialto, iron bioavailability was higher. No iron was taken up by the cells from white bread made from Rialto flour or from wholegrain bread from either variety, but Riband white bread produced a small ferritin response. The results probably relate to differences in phytate content of the breads, although iron in soluble monoferric phytate was demonstrated to be bioavailable in the cell model. Nicotianamine, an iron chelator in plants involved in iron transport, was a more potent enhancer of iron uptake into Caco-2 cells than ascorbic acid or 2'-deoxymugineic acid, another metal chelator present in plants.

Crossref

Adelaide Research & Scholarship

University of East Anglia digital repository

Rothamsted Repository